HP'S HIGH AVAILABILITY SOLUTIONS
TABLE OF CONTENTS
INTRODUCTION
HIGH AVAILABILITY REQUIREMENTS
WHAT CAUSES DOWNTIME?
FAULT TOLERANT vs HIGHLY AVAILABLE SYSTEMS
HIGH AVAILABILITY SOLUTIONS
     RELIABLE SYSTEMS
     DATA AVAILABILITY
          DISK ARRAYS
          HP DATAPAIR/800
     SYSTEM AVAILABILITY
          SWITCHOVER/UX
HIGH AVAILABILITY DIRECTIONS
FAULT TOLERANT SOLUTIONS
APPENDIX A
     SWITCHOVER/UX OPERATION
APPENDIX B
     SWITCHOVER/UX QUESTIONS AND ANSWERS
HIGH AVAILABILITY SOLUTIONS
INTRODUCTION
The demand for greater system availability expands as businesses
base their operation and competitiveness on mission critical
applications - applications that are integral in the day to day
operation of an organization. Because of market pressures and
user expectations, the need for higher availability is greater
than ever before. Services such as the following depend on
continuous operation of computing resources:
* On-line order entry, funds transfer, message switching and
realtime process monitoring.
* Operational services such as systems development capabilities
and departmental services such as billing and personnel.
* Administrative support applications such as electronic mail,
decision support and word processing.
The loss of computing capabilities can impact production, revenue,
critical decision making, customer satisfaction and even human
life. The Gartner Group summarizes the evolution occurring in the
1990s as follows:
"Technology is approaching 'no-compromise' 7x24x52 (7 days
per week, 24 hours per day, 52 weeks per year) availability
on a spectrum with substantial granularity. Users now can
match required service levels to the 'availability
investment' of a proposed solution."
System downtime can incur the following costs:
* Incomplete or late work resulting in penalties or loss of
customer confidence and satisfaction
* Idle staff, because the system failure isolates employees
from resources necessary to complete their tasks
* Underutilized capacity and loss of business opportunity
* Loss of human life
As the cost associated with downtime increases, greater levels of
availability are financially justified. Hewlett Packard provides
a broad range of availability solutions that includes reliable
components, redundant components, recoverable systems and fault
tolerance.
This white paper discusses how Hewlett Packard addresses
availability by providing solutions that prevent and eliminate
failures in the four areas that cause downtime: hardware, software,
operations, and the computing environment.
HIGH AVAILABILITY REQUIREMENTS
When planning a computing environment, there are several crucial
elements which are typically considered. Among the major elements
are performance, scalability, capacity, functionality,
interoperability and availability. Market pressures and user
expectations have forced system designers and implementers to
consider availability as an integral part of the system solution.
Organizations must incorporate higher availability solutions into
their environments in a way that fits their goals for
compatibility, connectivity, integration and cost-effectiveness.
It is possible to measure the likely cost of downtime and
determine the configuration or solution which addresses the level
of availability required.
There are several key requirements which today's computing environments
must incorporate to address the availability needs for an organization's
business. The three most crucial are:
1) data availability, integrity and recovery
2) minimal or no planned downtime
3) minimal unplanned downtime
How well a system satisfies each of these requirements determines
how well it can meet the defined needs.
The first requirement, data availability and integrity, is the
most critical since it refers to the accuracy, correctness,
consistency and validity of the data. In environments where
computer systems are constantly being relied upon to provide
information for making critical business decisions, data can never
be corrupted or permanently lost.
After the data is guaranteed intact, the second and third
requirements together determine the system's availability.
Planned downtime is the time a system is unavailable for
predetermined periods to perform tasks like system maintenance,
software or hardware upgrades and disk backups. Unplanned
downtime is the time a system is unavailable due to unanticipated
events like hardware failures, environmental errors and operator
errors. In general, the computing resources must be designed so
that neither planned nor unplanned downtime will negatively impact
essential business operations.
WHAT CAUSES DOWNTIME?
In developing a strategy to address the basic system requirements,
it is useful to understand why systems fail. The figure below
illustrates the results of a Gartner Group study on downtime for
both conventional and fault tolerant systems.
As shown in the figure, downtime can be divided into four
categories: hardware, software, people and environment.
Hardware
- Processors
- Peripherals such as disk drives, printers, etc
- Memory
Software
- Operating System
- Layered software products such as transaction processing
monitors and database management systems
- Application programs
People
- Management and operations personnel
- Support engineers
Environment
- Electrical power
- Catastrophic events such as fire, earthquake, flooding
Because hardware is only one component which contributes to system
failures, it is generally not enough to address the issue of
system availability with just fault tolerant processors. Systems
must be designed to survive a broad spectrum of failures ranging
from software and hardware faults to people and environmental
problems.
This paper details HP's current High Availability and Fault
Tolerant product offering and defines the strategic direction.
FAULT TOLERANT vs HIGHLY AVAILABLE SYSTEMS
Fault tolerant systems are installed in environments that require
no interruption of user service and absolute data integrity. The
architecture of fault tolerant systems includes fully redundant
components in a single package. The functionality handles
hardware failures and also ensures application programs cannot be
corrupted by errors resulting from any single point hardware
failure.
When a failure occurs, the recovery is virtually transparent to
applications and users, with minimal performance degradation.
Components of the system, such as CPU boards or memory can be
added or replaced on line, therefore minimizing the need for
planned downtime.
Highly available systems are sometimes referred to as
fault-resilient systems. They are capable of delivering some of
the features (such as integrity of data on disks) and functions of
fault tolerance; however, some system downtime will be associated
with a system failure. Highly available systems are based on
reliable computers, configured to significantly increase the
availability of the system. They quickly recover from both
software and hardware faults within minutes of detecting a
problem. They are typically a loosely coupled configuration.
Highly available solutions which incorporate multiple or redundant
components are able to efficiently utilize the available resources
under normal operating conditions.
HIGH AVAILABILITY SOLUTIONS
The increasing reliance on computing in all phases of business is
driving the need for increased data availability and integrity,
minimization of unplanned downtime and elimination of planned
downtime.
As shown in the following figure, five tiers of solutions exist
today to address system availability issues. The first of these
is to provide extremely reliable systems, systems which provide a
solid basis for operations. In the second tier, data integrity and
increased data availability can be obtained through the use of disk
arrays operating in high availability mode. To further address data
availability, the third tier provides disk mirroring functionality
through HP DataPair/800. The fourth tier provides nearly
continuous system processing with SwitchOver/UX, which is designed to
have a standby system take over for a failed mission-critical primary.
The final tier steps above high availability to the realm of fault
tolerance where the Series 1200 fault tolerant systems provide
absolute data integrity and continuous system processing.
The next several sections will discuss these tiers of availability
solutions.
RELIABLE SYSTEMS
HP's strategy for addressing the growing need for increased
system availability is to provide flexible, highly available
systems. The foundation of the Series 800 Business Servers
is their processors and peripherals, which lead the industry in
reliability.
HP's PA-RISC architecture and advanced VLSI technology
dramatically increase reliability by reducing the number of
parts that can fail. Actual field data shows that the MTBF
(Mean Time Between Failure) for HP systems exceeds three years
and the Series 8x7 systems have a projected MTBF of 4.8 years.
System uptime typically exceeds 99.97 percent. Other vendors
typically provide 99.5 percent uptime for conventional systems.
Although the 0.47 percent difference may appear insignificant, it
equates to an additional 41.2 hours of downtime per year compared
with the HP Series 800 systems.
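The arithmetic behind the 41.2-hour figure can be checked with a short
sketch. It assumes a 365-day year (8,760 hours) and treats the quoted
uptime percentages as exact; the function name is illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(uptime_percent):
    """Expected hours of downtime per year for a given uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


hp_downtime = annual_downtime_hours(99.97)     # about 2.6 hours per year
other_downtime = annual_downtime_hours(99.5)   # about 43.8 hours per year
extra_hours = other_downtime - hp_downtime

print(f"{extra_hours:.1f} additional hours of downtime per year")
```

The same function can be used to translate any proposed service level
into hours of expected downtime when weighing the cost of an
availability investment.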
Our disks also provide a very high MTBF due to HP's outstanding
disk technology. HP manufactures its own drive mechanisms for
use in both HP computers and the computers of other large
companies. The quality of drives is now measured in decades
between failures; a typical mass storage subsystem can have an
MTBF of 100,000 hours, or over 11 years of continuous operation.
This is a complete subsystem that includes power supply,
controller board, etc. (not just the disk itself).
HP will provide Support Watch, a proactive approach to hardware
support. HP Support Watch detects and reports transient
hardware faults before they result in unplanned downtime. If a
problem is found, the software alerts the HP Response Center
before a failure occurs.
HP quality extends to software as well. HP performs extensive
testing of all major releases of the HP-UX operating system.
The release criteria include certification for functionality,
usability, reliability, performance and supportability. The
software reliability testing is measured in Continuous Hours of
Operation and is accomplished through extensive feature and path
flow coverage. In addition, recovery capabilities are built
into the leading database management systems with which HP has
premier relationships.
To address unforeseen circumstances such as power outages, the
Series 800 Business Servers support power-fail auto restart.
Power-fail auto restart preserves what is in main memory in the
event of a power interruption so that, upon power restoration,
normal operations can resume. This eliminates data loss for
power interruptions of up to 15 minutes.
HP offers an Uninterruptible Power Supply (UPS), from a third
party manufacturer, Deltec. UPS provides a constant flow of
refined, regulated, computer-grade power through virtually any
utility line disturbance. The Deltec UPS protects hardware and
data from blackouts, brownouts, surges and spikes.
Since operator error accounts for a significant amount of
downtime, HP is taking steps to minimize operator intervention.
For example, the Series 800 Business Servers are shipped as an
integrated package which includes the interface cards, operating
system and networking software pre-installed. Another feature
is the autoconfiguration of the available devices into the
system upon boot-up of the system. This reduces the operator
involvement in configuring devices during the system startup.
To effectively manage the disk utilization, the HP-UX operating
system includes the capability to set disk space limits (disk
quotas). This provides a mechanism for controlling and
reporting user utilization of disk space. Other software
solutions such as HP OmniBack and HP OmniBack/Turbo provide
unattended network backup, thereby eliminating any need for
operator intervention and possible operator error.
In summary, HP's reliable systems provide a solid foundation
from which higher availability solutions can be added as
required. This foundation includes PA-RISC architecture,
systems and peripherals with high MTBF, HP Support Watch, high
quality HP-UX operating system software and Power-fail auto
restart.
FUTURE STRATEGY
HP will continue to make the Series 800 systems more highly
available. Several of the strategies which will be used to
accomplish this include online replacement of I/O cards, which
includes disk/tape controllers, data communication links, and
multiplexer cards. Online CPU and memory replacement is also
being investigated for the multiprocessing systems. Investigations
are also underway to provide a more fault resilient
operating system to isolate application failures and alleviate
system panics.
Additional availability functionality will be provided by
incorporating and adding value to emerging standards. To
minimize operator errors, the utilization of easy-to-use,
intuitive graphical user interfaces will play a major role in
the evolving application development. In addition, utilization
of fault isolation and event notification will be incorporated
within the evolving systems management solutions to further
automate overall operations.
Increased availability can be gained through greater software
and application quality. This is especially critical when
independently-developed applications are brought together for
the first time on a mission-critical system. To address the
software development cycle, HP continues to provide high level
languages, change management tools, and Computer Aided Software
Engineering (CASE) solutions. In addition, HP is investigating
online debuggers and editors to facilitate the loading of an
application on a running system.
DATA AVAILABILITY
The second tier for providing higher availability is data
availability. Since data is one of the most important assets of a
corporation, maintaining full access and integrity of the data is
critical to operations. There are two approaches to increasing
data availability: disk arrays and disk mirroring.
DISK ARRAYS
The primary functions of disk arrays are to increase data
availability, to increase total storage capacity, and to provide
performance flexibility by selectively spreading data over
multiple disk drives. HP's disk arrays protect data and allow
uninterrupted access to data in the event of a disk drive
failure. The disk array in high availability mode utilizes a
separate data protection disk to store an encoded form of data
from the other disks in the array. In the event of a disk drive
failure, the encoded disk allows continued access to the data
with no loss of system performance. The failed disk may later
be replaced on-line and the encoded data will be reconstructed
onto the new disk. HP's high availability disk arrays offer a
redundant array of inexpensive disks (RAID) level 3 mode of
operation which offers the best solution to meet HP OLTP
multiuser computer system requirements.
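The protection scheme can be illustrated with a minimal sketch. It
assumes that the "encoded form" kept on the protection disk is a
byte-wise XOR parity of the data disks, which is the usual approach
for RAID level 3; the paper itself does not spell out the encoding.
The disk contents and the four-disk layout are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data disks plus one protection disk (hypothetical contents).
data_disks = [b"ORDR", b"FUND", b"MSGS", b"PROC"]
protection_disk = xor_blocks(data_disks)

# Simulate the loss of disk 2, then rebuild its contents from the
# surviving data disks and the protection disk.
failed = 2
survivors = [d for i, d in enumerate(data_disks) if i != failed]
rebuilt = xor_blocks(survivors + [protection_disk])

print(rebuilt == data_disks[failed])
```

Because XOR is its own inverse, the same operation serves both to
compute the protection data and to reconstruct a missing disk.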
In high availability mode, the disk array appears to the host
system as a single 2.7 or 5.4 gigabyte disk. All the individual
mechanisms work in unison on every request from the host. The
controller queues requests and executes them sequentially.
Since all the mechanisms work together in parallel, the transfer
rate of the array can be as high as four times the transfer rate
of a single disk. This means that the high availability mode
will be most efficient at processing long, large transfers, or
on systems with low I/O (transaction rate) demand on disk
subsystems.
The protection in the disk array is provided at a lower cost
than fully redundant disks used in disk mirroring solutions
since only 25% more disk space is needed to store the encoded
data versus up to 100% more disk space needed to mirror data.
However, disk mirroring solutions do offer more data protection
in that all cables, interfaces, power supplies and controllers
are duplicated.
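The space comparison follows from simple ratios, sketched below. The
sketch assumes the 25% figure corresponds to one protection disk per
four data disks, which matches the four-times transfer rate quoted
above but is an inference; the function names are illustrative.

```python
def protection_disk_overhead_percent(data_disks):
    """Extra space for one protection disk shared by several data disks."""
    return 100.0 / data_disks


def mirroring_overhead_percent(copies=2):
    """Extra space when every block is kept in the given number of copies."""
    return 100.0 * (copies - 1)


array_overhead = protection_disk_overhead_percent(4)
mirror_overhead = mirroring_overhead_percent()

print(f"disk array: {array_overhead:.0f}% extra, mirroring: {mirror_overhead:.0f}% extra")
```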
HP DATAPAIR/800
HP DataPair/800 prevents data loss by maintaining two copies of
data on separate disks so that the data is still intact after
any single disk or interface card failure. DataPair/800 can
mirror any disk partitions including the root and swap
partitions. It supports mirroring of raw disk access as well as
access through the file system. Mirroring performance depends
upon the mix of disk reads and writes. The mirrored write speed
is slightly slower than unmirrored writes since writes to both
disks must be completed before the mirrored write is complete.
Reads are done from the least busy disk and are optimized to the
point that the driver can achieve more than 100% of the I/O
accesses per second of a single unmirrored disk on read-intensive
applications.
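The read and write behavior described above can be modeled in a few
lines. This is a toy sketch, not the DataPair/800 driver: the class
name, the use of dictionaries as disks, and the queue depth as the
measure of "least busy" are all assumptions made for illustration.

```python
class MirroredPair:
    """Toy model of a two-disk mirror; each disk is a dict of block -> data."""

    def __init__(self):
        self.disks = [{}, {}]
        self.queue_depth = [0, 0]  # outstanding requests per disk (assumed load measure)

    def write(self, block, data):
        # A mirrored write is complete only after both copies are updated.
        for disk in self.disks:
            disk[block] = data

    def read(self, block):
        # Serve the read from whichever disk has fewer outstanding requests.
        least_busy = min(range(len(self.disks)), key=lambda i: self.queue_depth[i])
        return least_busy, self.disks[least_busy][block]


pair = MirroredPair()
pair.write(7, "payroll record")
pair.queue_depth = [3, 1]  # pretend disk 0 is currently busier
chosen_disk, data = pair.read(7)
print(chosen_disk, data)
```

Since either disk can satisfy a read, two reads can proceed at once,
which is why read throughput can exceed that of a single disk.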
HP DataPair/800 works transparently to the application. The
disk mirroring software works with the HP-UX kernel to manage
the mirrored disks, so no application modification is required.
HP DataPair/800 allows you to take a disk in a mirrored pair
off-line to perform a backup, while applications continue to
access data from the on-line disk. As the backup is being
performed, the changes which are being made to the on-line disk
are tracked in a table in memory. When you bring the disk back
on-line after the backup procedure, a fast update is done to
return the mirror to a synchronized state. Application
continuity is maintained during the synchronization state.
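The off-line backup and fast update can be sketched as follows. The
sketch assumes the in-memory table records which blocks changed while
the mirror was split, so that only those blocks are copied on
resynchronization; the class and method names are hypothetical.

```python
class MirrorWithOfflineBackup:
    """Toy mirror that tracks changed blocks while one disk is off-line."""

    def __init__(self, blocks):
        self.online = list(blocks)
        self.offline = list(blocks)
        self.split = False
        self.changed = set()  # block numbers written while the mirror is split

    def take_offline(self):
        self.split = True
        self.changed.clear()

    def write(self, block_no, data):
        self.online[block_no] = data
        if self.split:
            self.changed.add(block_no)      # remember for the fast update
        else:
            self.offline[block_no] = data   # normal mirrored write

    def bring_online(self):
        """Copy only the changed blocks; return how many were copied."""
        copied = len(self.changed)
        for block_no in self.changed:
            self.offline[block_no] = self.online[block_no]
        self.changed.clear()
        self.split = False
        return copied


mirror = MirrorWithOfflineBackup(["a", "b", "c", "d"])
mirror.take_offline()          # the backup reads from mirror.offline here
mirror.write(1, "B")
mirror.write(1, "B2")
mirror.write(3, "D")
blocks_copied = mirror.bring_online()
print(blocks_copied, mirror.online == mirror.offline)
```

Only two of the four blocks are copied in this example, which is what
makes the update fast compared with recopying the whole disk.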
You can create or delete a mirror at any time without suspending
your application thus allowing dynamic configuration. HP
DataPair allows you to mirror the data which is critical to the
user or the application. Partitions of the disk or the entire
disk can be mirrored.
The operation of HP DataPair/800 can be done either using the
menu-driven interface of the HP-UX System Administration
Manager (SAM) tool or through HP-UX shell commands. HP
DataPair/800 works with HP Fiber Link (HP-FL) technology. Sets
of disks are connected to the system using HP-FL interface
cards. This prevents the I/O interface card from becoming the
single point of failure.
FUTURE STRATEGY
To address the need for increased data availability, HP will
continue to provide additional modes of disk array technology.
There are several modes of RAID technology including a mode to
provide mirroring capability.
HP plans to incorporate Open Software Foundation (OSF) technology
into the HP-UX operating environment. One such technology is
the Logical Volume Manager (LVM). The LVM provides two primary
pieces of functionality: disk management flexibility and
mirroring capability. In the area of disk management
flexibility, the LVM allows user-defined disk partitions,
concatenation of partitions into volumes, and file systems and raw
partitions that span multiple physical disks. It also provides the
ability to grow volumes online and to dynamically detect, relocate
and repair bad disk sectors.
In the area of high availability, the LVM provides triple
mirroring of volumes across SCSI and/or Fiber Link interfaces.
Triple mirroring allows mirroring, even during an online backup.
HP plans to offer the LVM in 1992.
SYSTEM AVAILABILITY
The next tier in providing higher availability addresses
system availability. Data availability solutions do not respond
to system interruptions which result from either system hardware
or software failures. In the event of a failure at the system
level, the response should be automatic and should quickly return
operations to normal conditions.
SWITCHOVER/UX
SwitchOver/UX provides near continuous operation of your mission
critical computing environment. It is a combination of software
functionality and hardware interconnection that enables your
computer system, and applications which run on it, to recover
promptly from failure of a system processing unit (SPU).
Using SwitchOver/UX, you can build groups of hosts that provide
high system availability. In the event of a
processor failure, a standby processor can automatically begin
providing the services of the failed processor. The standby
processor is connected to the same disks and LAN as the primaries
so it will become an identical replacement for the failed primary.
SwitchOver/UX is ideally suited for environments with the
following characteristics:
1) Require near-continuous operation.
2) Have a mix of critical and non-critical applications.
3) Have multiple systems networked together.
4) Run transaction-based applications.
5) Utilize industry standard databases.
6) Cannot justify the added cost of fault tolerance.
While most target environments will have these characteristics,
SwitchOver/UX is flexible enough to be employed in a wide range of
applications.
SwitchOver/UX provides fast recovery without operator
intervention. Using a "heartbeat" message to communicate the
state of the system, the primary or primaries allow the standby
to automatically detect a system failure of a primary and
automatically reboot the standby to become an identical
replacement for the failed primary. Systems configured with
SwitchOver/UX do not require an operator to be present in order
to detect or respond to a failure. Since there is no need for
operator intervention, the recovery time is significantly
decreased.
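Heartbeat-based failure detection can be sketched in general terms as
follows. This is not the SwitchOver/UX protocol, whose message format
and timing are not described here; the timeout value, system names and
class name are assumptions, and time is passed in explicitly so the
example is deterministic.

```python
class HeartbeatMonitor:
    """Standby-side record of when each primary was last heard from."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}

    def record_heartbeat(self, primary, now):
        self.last_seen[primary] = now

    def failed_primaries(self, now):
        # A primary is presumed failed once it has been silent for
        # longer than the timeout.
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout_seconds=30)
monitor.record_heartbeat("primary-a", now=0)
monitor.record_heartbeat("primary-b", now=0)
monitor.record_heartbeat("primary-a", now=20)   # primary-b has gone silent

failed = monitor.failed_primaries(now=40)
print(failed)
```

In a real configuration the standby would act on this list by starting
its takeover sequence; the choice of timeout trades detection speed
against the risk of reacting to a brief network delay.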
By having multiple data paths through different processors, LANs
and I/O channels along with disk mirroring, there is no single
point of failure in a loosely-coupled processor configuration.
The individual failure of a component will not bring the system
down for an extended period of time.
The recovery time for a system is dependent on the system size,
configuration and application. The recovery process includes
fault detection, system recovery and application recovery.
(Refer to appendix A for details on the recovery process.)
The standby system is active and can be used to provide
non-critical services. For example, while the primaries are
running critical services in a production environment, the
standby could be used as a development system. When the primary
fails, the processes on the standby can be gracefully shut down
before the standby begins its recovery process. Once the
primary is repaired it can be used as a standby to continue
development work. By being able to utilize the standby for
services which can be interrupted for a short period of time in
the event of a processor problem, SwitchOver/UX maximizes your
computing investment.
Most database applications are capable of recovering from
reboots. Log files in databases keep track of all transactions
that the database has accepted and committed to disk. When a
system is forced to reboot, either due to a system fault, power
failure or other incident, the database will lose the current
transaction (assuming it has not been completed), but not the
committed transactions. As a result, end users only need to
check on the last transaction they were entering. They do not
need to check on committed transactions because databases are
designed to work with the reboot mechanism SwitchOver/UX
employs.
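The effect of log-based recovery after a reboot can be sketched as
follows. The log record layout is hypothetical and far simpler than
that of any real database; the point is only that writes belonging to
a transaction without a commit record are discarded.

```python
def recover(log):
    """Rebuild state from a log, applying only committed transactions."""
    pending = {}  # transaction id -> writes not yet committed
    state = {}
    for record in log:
        kind, txn = record[0], record[1]
        if kind == "write":
            pending.setdefault(txn, []).append((record[2], record[3]))
        elif kind == "commit":
            for key, value in pending.pop(txn, []):
                state[key] = value
    # Whatever is still pending belongs to transactions cut off by the failure.
    return state, sorted(pending)


log = [
    ("write", 1, "order-17", "shipped"),
    ("commit", 1),
    ("write", 2, "order-18", "entered"),
    # The system fails here; transaction 2 never committed.
]
state, lost_transactions = recover(log)
print(state, lost_transactions)
```

Here only transaction 2 must be re-entered by the user, which matches
the behavior described above for the last transaction in progress.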
After the standby takes over for the failed primary, users need
to log back into the system. Users do not need to know that
they are logging into a different system. They can use the same
procedures they followed to log in the first time (i.e., they do
not need to use a different machine name to log in).
FUTURE STRATEGY
In mid 1992, SwitchOver/UX will provide interoperability with the
Logical Volume Manager, support an FDDI network and operation
with both SCSI and HP-FL disks. To provide a quicker recovery
period, a journaled file system is being investigated to speed
up the system reboot. A journaled file system keeps a copy of
every disk operation that has occurred until the contents of
memory have been written to disk. In the event a system fails,
only the changes made to the disk since the last update are
checked and reconstructed as necessary. This greatly reduces
the file system check phase and allows quick file system
restarts.
HIGH AVAILABILITY DIRECTIONS
DISTRIBUTED AVAILABILITY
HP recognizes the trend toward distributed data processing in the
1990s. Consequently, the horizon of availability solutions must
expand to meet these needs. Based on trends, distributed
availability solutions will begin to play a significant role in
the 1994-1995 timeframe due to the implementation of Client/Server
technology and cooperative processing. In addition, as business
operations span the world, system and data access must be
independent of distance and location.
Distributed availability solutions consider a network of systems,
whether they are across a site or around the world, as a single
environment. The primary goal of distributed availability is to
provide transparent data access and transaction routing in the
event that any one node, communication link, or site fails.
Similar to system-based high availability requirements,
distributed availability must be based on reliable systems and
data communications. Of particular importance is the networking
of systems and the major reliance on data communications. Through
technology defined by the Open Software Foundation (OSF), HP plans
to incorporate distributed networking capabilities. HP
currently provides the Network Computing System (NCS), which is part
of the OSF Distributed Computing Environment (DCE). NCS
facilitates the interconnectivity of systems within a heterogeneous
networked environment. HP also will be providing FDDI
capabilities as well as redundant LAN functionality to accommodate
network robustness. Using redundant networking, a failure of a
communications link does not impact data flow since the networking
traffic proceeds through an alternate network link.
OSF has defined a distributed naming service to facilitate the
communication of information throughout a networked environment.
This will allow users to identify, by name, resources such as
servers, files, disks, or print queues and gain access to them
without needing to know where they are located in the network.
This will work in LAN as well as WAN environments. Part of the
distributed naming service is the Andrew Distributed File System
from Transarc. This provides data accessibility across geographic
sites while retaining a consistent and uniform naming convention
without compromising on performance. The systems which encompass
the distributed environment appear as one worldwide file system.
HP plans to take advantage of the Andrew Distributed File System
in future releases of the HP-UX operating system.
The distributed file system also will be designed to enable
administrators to do file system backups while the system is up
and running. The plan is to create a replicated copy of the file
system, then back up the copy. This will allow users to continue
changing the original data while performing required system
administration.
Maintaining data integrity is critical within a distributed
environment. The utilization of transaction processing technology
will provide transaction commitment and concurrency. Using
protocols such as two-phase commit, database vendors will maintain
data integrity through the coordination of committed transactions
to the participating distributed systems. The two-phase commit
protocol provides the capability to either commit or abort a
transaction that spans multiple systems. In the event one of the
participating systems in the distributed environment fails during
the course of a transaction commitment, the total commit will be
aborted. This provides transaction integrity across multiple
systems. After a failure, recovery can begin by identifying, from
the transaction monitor log, the uncommitted transactions. HP
plans to take advantage of transaction technology from Transarc to
assist in the coordination and commitment of multiple-site
transactions.
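The commit-or-abort behavior of two-phase commit can be sketched as
follows. This is a minimal illustration with no logging, timeouts or
recovery, and it is not based on any Transarc interface; the class,
function and site names are hypothetical.

```python
class Participant:
    """One system taking part in a distributed transaction."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if this system can guarantee the commit.
        if not self.available:
            return False
        self.state = "prepared"
        return True

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]   # phase 1: collect votes
    if all(votes):
        for p in participants:                    # phase 2: commit everywhere
            p.commit()
        return "committed"
    for p, vote in zip(participants, votes):      # phase 2: undo prepared work
        if vote:
            p.abort()
    return "aborted"


healthy = [Participant("site-a"), Participant("site-b")]
outcome_ok = two_phase_commit(healthy)

degraded = [Participant("site-a"), Participant("site-b", available=False)]
outcome_failed = two_phase_commit(degraded)
print(outcome_ok, outcome_failed)
```

A single failed participant is enough to abort the whole transaction,
so no site is left holding a change that the others did not make.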
SUMMARY
In order to meet customers' increasing availability needs, HP will
continue its high availability and fault tolerant program through
the following directions: continued enhancement of existing
availability and fault tolerant products, delivery of additional
availability functionality to "round out" the availability
offering, and movement toward providing distributed availability
solutions to meet the needs of the emerging distributed computing
trend. When possible, these directions will leverage emerging
standards either directly or by adding value upon them.
The HP9000 systems offer a broad range of high availability and
fault tolerant solutions today to minimize or eliminate computing
interruption and financial impact due to loss of data or
application availability. HP plans to continue to improve these
solutions with additional availability functionality, and meet the
requirements of distributed computing by providing distributed
availability solutions in the future.
FAULT TOLERANT SOLUTIONS
The Series 1200 family is designed for continuous operations, and
high performance OLTP applications. The family is based on
Motorola's 32-bit high performance 68K microprocessors. The two
systems, Model 1240 and Model 1245, run the same Operating System,
HP-FX, and are completely object code compatible.
The Model 1240, first introduced in March 1990, is based on
Motorola's 68030 microprocessors and supports 256Kbyte of Cache.
The Model 1245, introduced in August 1991, is a board upgrade to
the Model 1240. Based on Motorola's fastest microprocessor, the MC68040
(rated at 20 MIPS), the Model 1245 provides up to three times the
performance of HP's low-end Fault Tolerant system.
To achieve higher performance rates, the Model 1245 supports a
larger Cache size, which significantly minimizes the traffic
across the system backplane. All models have a tightly coupled
architecture with processors sharing global memory. This
architecture was designed to support symmetric multiprocessing
where system performance scales linearly as hardware resources are
added. The systems are modular and expandable. Upgrades and
reconfiguration can be done on-line. HP's FT computers are based
on open systems, and conform to major industry standards.
Each of the primary system resources (Processors, Memory, and
IO Elements) can be added to the system on-line, with no
interruption of the application. Up to 32 Processor Elements can
be added to the system, providing up to 600 MIPS. Memory Elements
are designed to support high-throughput applications, and can
support up to 2GB of shadowed memory (4GB of physical memory).
The system supports mainframe type applications, which require
large disk farms, high-speed communications interfaces, and
support for multiple LANs. The system also supports a large number
of terminal-based users.
Designed from inception to be a Fault Tolerant platform, the 1200
system architecture ensures applications and data will always be
available.
All critical components are duplicated, to prevent single points
of failure. The system identifies faults within one machine cycle
to prevent any data corruption. The failed component is
immediately isolated (in case of a permanent failure), and the
system transparently returns to normal operation. HP-FX, the 1200's
Symmetric Multi Processing Operating System, has been designed
from the ground up to eliminate malfunctions typical to other UNIX
implementations.
The 1200 systems provide an optimum trade-off between hardware
and software to assure absolute data integrity. Within one clock
cycle the system detects any faults using embedded hardware
logic. The failed component is then isolated and the system is
logically re-configured using software techniques. This combined
hardware/software approach is an efficient way of handling a
complex problem that occurs seldom but requires immediate action.
Error detection and correction codes are employed in the main
memory, IO and processor cache and in the data buffers on the
system bus, to ensure the integrity of data throughout the system.
Protocol monitors are employed by each element involved in a bus
transfer to ensure data integrity on the bus. In addition, timers
on the processor boards monitor the elapsed time between various
system events, and generate an alarm if the interval exceeds
normal limits.
Self-checking circuitry checks all outputs. For example, each
processor element consists of two processor chips with comparator
logic onboard. All instructions are executed on both chips, with
the results of each compared for consistency. IO elements also
have duplicate processors that check each other. Even the
error detection hardware is checked to ensure it functions
properly (who guards the guard?).
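The comparator idea can be illustrated in software. The following is a
minimal sketch only, not a description of the actual hardware logic: the
names run_in_lockstep and LockstepMismatch are hypothetical, and each
"chip" is modeled as an ordinary function.

```python
from typing import Callable


class LockstepMismatch(Exception):
    """Raised when the two copies of a computation disagree."""


def run_in_lockstep(chip_a: Callable[[int], int],
                    chip_b: Callable[[int], int],
                    operand: int) -> int:
    # Execute the same instruction on both "chips" (hypothetical model).
    result_a = chip_a(operand)
    result_b = chip_b(operand)
    # Comparator: a disagreement is treated as a detected fault, and the
    # result is never released to the rest of the system.
    if result_a != result_b:
        raise LockstepMismatch(f"{result_a!r} != {result_b!r}")
    return result_a
```

A result is returned only when both copies agree, so a fault in either
copy is caught before it can propagate.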
One of the key differentiators of Models 1240/45 is the ability
to deal with both permanent and transient errors. Other systems
often treat transient errors as software errors when in fact they
usually are caused by thresholding hardware. The capability to
deal with transient errors allows the resources of Models 1240/45
to be used much longer than if they were quickly shut down at the
first trace of a fault.
After the hardware detects and isolates an error, HP-FX performs
the "intelligent" recovery operations. Recovery operations
determine whether the error is permanent or transient. If permanent, or if
the component has exceeded its pre-established threshold, the
system sends the process to the next available processor.
The system relies on duplicate copies of each process's state in
main memory. When a failure occurs, the system can use this state
to restart the process from the point of the last cache flush. If
it is a transient error, the module is placed back in service.
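The permanent/transient decision described above can be sketched as
follows. This is an illustration under stated assumptions, not HP-FX
code: the class name FaultTracker, the return strings and the default
threshold of 3 are all hypothetical.

```python
from collections import Counter


class FaultTracker:
    """Decide whether a faulted component is isolated or returned to service."""

    def __init__(self, transient_threshold: int = 3):
        # Assumed policy: a component may report this many transient
        # errors before it is treated as failed.
        self.transient_threshold = transient_threshold
        self.transient_counts = Counter()
        self.out_of_service = set()

    def record(self, component: str, permanent: bool) -> str:
        if not permanent:
            self.transient_counts[component] += 1
        # Permanent errors, or too many transient ones, isolate the
        # component; its work then moves to another processor.
        if permanent or self.transient_counts[component] > self.transient_threshold:
            self.out_of_service.add(component)
            return "isolate"
        # Otherwise the module is placed back in service.
        return "return_to_service"
```

Counting transient errors per component is what lets a module stay in
use after an isolated glitch while still being retired if the glitches
keep recurring.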
The system can withstand multiple processor and memory module
failures. If other components of the system fail, the system will
run, but it cannot withstand another failure in that area. Errors
are logged to disks and reported to the system administrator. A
repair request will also be automatically transmitted to the HP
Response Center. Fault Tolerance is transparent to application
developers.
HP-FX has been designed from the ground-up to prevent system
failures due to malfunction in the system S/W. Other Fault
Tolerant UNIX systems, based on AT&T's UNIX Kernel, do not address
all the aspects of S/W reliability.
HP-FX has been redesigned to improve on existing UNIX technology.
From its inception, UNIX was not built as a fault tolerant
operating System, nor has it been architected to support Symmetric
Multi Processing. Therefore, HP-FX's Kernel was redesigned to
enable fault tolerant features needed for OLTP computing
environments.
At any time, the system may experience a hardware fault in a PE,
ME, IOE, system bus, controller, or peripheral device. Whenever a
fault occurs, it must be possible to recover the process that
experienced the fault without losing its process state or losing
or duplicating I/O operations. To ensure this, the kernel
guarantees that the state of every process in main memory is
always internally consistent; that the state of kernel data
structures is consistent with the main memory process state; and
that no I/O operations are lost or repeated.
Structured coding practices were put in place to help improve
kernel source readability, and thus expedite bug isolation and
system maintenance. These practices are enforced through software
tools, used to guarantee consistency of HP-FX code.
Hardware dependent code has been separated and is clearly isolated
from generic hardware independent code, thus increasing ease of
defect isolation and repair. Additionally, new software
modifications may be more easily inserted. New hardware
technologies require re-work of only the hardware dependent code.
Models 1240/45 go beyond hardware fault tolerance by providing
features to minimize the chance of the system failing due to
software errors. The primary feature is run-time assertions.
Originally added to aid debugging, the HP-FX run-time assertions
detect inconsistent process states. Should a comparison with an
assertion reveal an inconsistency, the OS will retry the last
operation.
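The assert-and-retry pattern can be sketched in a few lines. This is a
user-level illustration only; the function run_with_assertion, its
parameters and the single-retry default are hypothetical and do not
reflect the HP-FX kernel interfaces.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_assertion(operation: Callable[[], T],
                       is_consistent: Callable[[T], bool],
                       max_retries: int = 1) -> T:
    # Run the operation, check the run-time assertion, and retry the
    # last operation if the resulting state is inconsistent.
    for _ in range(max_retries + 1):
        result = operation()
        if is_consistent(result):
            return result
    # Assumed behavior: give up once the retry budget is exhausted.
    raise RuntimeError("state still inconsistent after retry")
```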
APPENDIX A
SwitchOver/UX Product Operation
SwitchOver/UX consists of a set of programs running on the primary
and the standby hosts. (A host is the collection of programs and
data used by a processor.) There are two "daemon" programs which
are used to monitor the state of the primary hosts. (A daemon
program is one which runs automatically in the background to
provide certain system services.) A primary host in a highly
available group runs a daemon called heartbeat. This program
periodically transmits a message to the standby host over the local
area network. The standby host runs a daemon called readpulse,
which "listens" for the heartbeat messages.
There are five phases to SwitchOver/UX's operation:
1) Normal health checking
2) Fault detection and recovery
3) Application recovery
4) Resume processing
5) Processor Repair
Normal Health Checking
During normal operation, each primary in a
loosely-coupled processor group sends out a "heartbeat" across the
LAN to the standby system informing it that everything is
functioning properly (Figure 1). The standby system "listens" for
these heartbeats and as long as it receives these heartbeats, it
will continue to run its own applications.
FIGURE 1
Fault Detection and Recovery
When the standby misses a "heartbeat" from a primary processor, it
assumes the primary has failed and begins the recovery process
(Figure 2). First, the standby processor locks the root disk of
the primary processor. This prevents the primary processor from
inadvertently accessing the disks and possibly corrupting data
once the standby has assumed the responsibilities of the failed
primary. Next, the standby reboots itself using the root of the
failed system. When the standby finishes rebooting, it will also
have the network address of the failed system.
NOTE: Once the standby initiates the recovery process, it cannot
be reversed until the processor recovery is completed. Also, any
disks the standby was accessing prior to recovery will not be
accessible until the failed primary is repaired.
FIGURE 2
Application Recovery
After the standby system has finished assuming the identity of the
failed system, applications need to go through their own
recovery routines. Most databases support recovery from a reboot
and can automatically bring themselves to a consistent state.
Custom applications need to be structured to recover from a
reboot. Applications can be automatically restarted through the
use of HP-UX's standard initialization and start-up files.
Resume Processing
Users log back into the system using their normal procedure, and
re-start their individual applications. Users do not need to know
that they are running on a different processor (Figure 3). For
most database applications, users may need to check on their last
transaction before processing was interrupted. Typically, the
current transaction will be lost when processing is interrupted.
The amount of data loss will depend upon the individual
application and database.
Batch applications which are interrupted will also need to be
re-started. Depending upon the recovery capabilities of the
application, either it will need to be re-started from the
beginning or from a point where the state of the program and data
are consistent.
FIGURE 3
SPU Repair
After the standby has assumed the responsibilities of the failed
primary, it becomes the primary and the failed primary is treated
as a standby system. The failed system is then repaired using
HP's normal repair processes with the assistance of a system
administrator or repair technician. Once the system is repaired,
it can be brought up as a standby for the other primaries or it
can resume being a primary (Figure 4). In order to return the
failed primary to being a primary again, the standby which took
over for the primary will need to be shut down so the primary can
regain its disks and network address. This switchover should be
scheduled when the system is lightly loaded and users can afford
the brief downtime.
FIGURE 4
Recovery Time
The recovery time for a system is very application dependent.
Figure 5 diagrams the recovery process and shows which stages are
time dependent upon customer applications. There are three key
components to recovery time:
1) Fault detection
2) System recovery
3) Application recovery
FIGURE 5
Fault Detection
Fault detection is determined by the frequency of the heartbeats
and how many heartbeats the standby will allow to go by before it
initiates recovery procedures. These two parameters are definable
by the system administrator. Typically this stage of the recovery
process will take less than one minute.
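The relationship between the two parameters can be sketched as below.
This is not the readpulse implementation; the class PulseMonitor and
the parameter names interval_s and misses_before_recovery are
hypothetical, and times are plain seconds supplied by the caller.

```python
class PulseMonitor:
    """Track heartbeats from one primary and decide when to start recovery."""

    def __init__(self, interval_s: float, misses_before_recovery: int):
        self.interval_s = interval_s
        self.misses_before_recovery = misses_before_recovery
        self.last_heard_s = 0.0

    def heartbeat_received(self, now_s: float) -> None:
        self.last_heard_s = now_s

    def missed(self, now_s: float) -> int:
        # Number of whole heartbeat intervals that have elapsed in silence.
        return int((now_s - self.last_heard_s) // self.interval_s)

    def primary_presumed_failed(self, now_s: float) -> bool:
        return self.missed(now_s) >= self.misses_before_recovery
```

Under these assumptions the detection time is roughly interval_s times
misses_before_recovery; a 5-second heartbeat with 3 allowed misses
would detect a failure about 15 seconds after the last heartbeat.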
System Recovery
During system recovery the standby processor reboots and checks
all disk file systems to correct any problems that may have been
created when the primary processor failed. The reboot time is
dependent upon the class of machine (e.g., 827 or 832) and the
amount of RAM. The disk checking time depends on how much
disk space is used for the file system and the state these files
were in at the time of the failure. This component may be the
largest part of the recovery time. It may be significantly
reduced by using databases and applications which access the disks
directly and bypass the HP-UX file system. The most recent
versions of industry leading database products typically bypass
the file system (using raw disk) which helps to minimize recovery
times.
Application Recovery
Once the processor has recovered and the file system is intact,
the application needs to perform its own recovery. For databases
this would mean rolling transactions back to a known state and
then rolling them forward, completing all committed transactions
and discarding all incomplete transactions. The time it takes for
this stage is entirely dependent upon the application and the
transaction rate prior to the failure. This portion of the
recovery time can be minimized by choosing databases and
structuring applications to recover quickly from system reboots.
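The roll-forward step can be illustrated with a toy log replay. This is
a simplified sketch, not the recovery algorithm of any particular
database product: the function recover, the log record layout and the
checkpoint dictionary are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical log record: (transaction id, operation, key, value),
# where operation is "write" or "commit".
LogRecord = Tuple[int, str, Optional[str], Optional[int]]


def recover(checkpoint: Dict[str, int], log: List[LogRecord]) -> Dict[str, int]:
    # Start from the last known consistent state.
    state = dict(checkpoint)
    # First pass: find the transactions that reached commit.
    committed = {txn for txn, op, _key, _value in log if op == "commit"}
    # Second pass: roll forward, applying writes of committed
    # transactions and discarding writes of incomplete ones.
    for txn, op, key, value in log:
        if op == "write" and txn in committed:
            state[key] = value
    return state
```

The longer the log since the last consistent state, the longer this
stage takes, which is why the transaction rate prior to the failure
affects recovery time.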
SwitchOver/UX Configuration with DataPair/800
Disk Mirroring (DataPair/800, Product Number 92625A) is not
required for a SwitchOver/UX system, although to maximize your
data availability and integrity in a highly available environment,
it is recommended.
SwitchOver/UX and HP DataPair/800 require Fiber-Link ("FL") disks.
Any Fiber-Link Interface disks may be used, though only disks with
the same product numbers may be mirrored. Up to eight disks can
be connected in one P-bus chain. The P-bus cable chains the disks
together, providing the data path for the standby processor to
access the files on the primary processor's disk(s) if a primary
fails.
Configurations
SwitchOver/UX supports configurations of one processor acting as a
standby for up to seven primary processors. SwitchOver/UX
operates on systems within the same HP 9000 Series 800 system
categories. Each loosely-coupled processor group must have
processors which can boot off the same root and have identical I/O
configurations. There are five system categories. Systems within
the same category can be used together to form loosely-coupled
processor groups. These categories are:
(1) (2) (3) (4) (5)
827S 822S 825S 845S 850S
847S 832S 835S 855S
857S 842S 860S
867S 852S 865S
877S 870/100
870/200
870/300
870/400
Note: SwitchOver/UX is not supported on the 807S, 817S, 837S, 808S
and 815S Systems.
APPENDIX B
SWITCHOVER/UX QUESTIONS AND ANSWERS
General Definitions
SPU - the box containing the processor
Host - the disks and the set of programs run by the SPU
Asymmetric configuration - the primary SPUs are not necessarily
connected to any disks other than their own. The standby SPU is
connected to all disks.
Symmetric configuration - each SPU is connected to all disks.
Primary Host - Composed of programs and data that are
indispensable.
Standby Host - Composed of tasks that may be interrupted at any
moment without causing serious difficulty.
SwitchOver/UX Questions:
Q: How similar must the primary and standby hosts be? For example,
must they have the same disks, I/O cards,etc?
A: There are two types of SwitchOver/UX configurations:
symmetrical and asymmetrical. In the symmetrical configuration
all SPUs have access to all disks. Any one of the SPUs can be
the designated standby system. Inherent in this configuration,
the card layout must be the same. There can be varying number
of disks associated with the various SPUs. In the asymmetrical
configuration, there is a designated standby system which has
access to all disks in the configuration. The I/O
configuration of the standby SPU must be a superset of all the
primary SPUs it is backing up.
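The superset rule for the asymmetrical configuration can be checked
mechanically. The sketch below is illustrative only; the function
uncovered_primaries and the card names are hypothetical and are not
part of SwitchOver/UX.

```python
from typing import Dict, List, Set


def uncovered_primaries(standby_io: Set[str],
                        primaries_io: Dict[str, Set[str]]) -> List[str]:
    """Return the primaries whose I/O cards the standby does not fully cover."""
    # The standby qualifies only if its I/O configuration is a superset
    # of every primary's configuration.
    return sorted(name for name, cards in primaries_io.items()
                  if not cards <= standby_io)


# Example with made-up card names: hostB needs a card the standby lacks.
standby = {"lan0", "fl0", "fl1"}
primaries = {"hostA": {"lan0", "fl0"}, "hostB": {"lan0", "fl1", "hpib0"}}
problems = uncovered_primaries(standby, primaries)
```

An empty result means the standby can back up every primary in the
group.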
Q: What happens when the LAN fails?
A: There are two scenarios based on the customer configuration.
Scenario 1: Customer has one LAN connection. In this case,
the heartbeat daemon running on the primary will no longer be
able to transmit its message across the LAN. Similarly, the
readpulse daemon running on the standby will no longer hear
messages from the primary. The standby will assume the primary
has failed and begin the takeover process.
Scenario 2: Customer has two LANs. When there are two LANs within
the configuration, the heartbeat messages from the primary are
passed over BOTH LANs at all times. The standby is
listening for heartbeat messages from either LAN. If one of
the LANs fails, the heartbeat message will continue to be
transmitted over the remaining LAN. The systems will both
continue to operate without interruption. Note that the peripheral
devices (DTCs, workstations, etc.) which are connected to the
failed network may need to be switched.
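The difference between the two scenarios comes down to one rule: the
standby starts a takeover only when every LAN has gone silent. A
minimal sketch, assuming a hypothetical should_take_over function that
is given the time each LAN last delivered a heartbeat:

```python
from typing import Dict


def should_take_over(last_heard_s: Dict[str, float],
                     now_s: float,
                     timeout_s: float) -> bool:
    # With no LAN configured there is nothing to judge by.
    if not last_heard_s:
        return False
    # Take over only if ALL LANs have been silent longer than the
    # timeout; a heartbeat on any one LAN shows the primary is alive.
    return all(now_s - heard > timeout_s for heard in last_heard_s.values())
```

With a single LAN the rule reduces to Scenario 1: a LAN failure is
indistinguishable from a primary failure, so the standby takes over.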
Q: After a failover how do you restore your switchover
configuration?
A: The process differs based on the configuration. Symmetrical
configuration: Boot the standby host on the repaired primary
SPU. This results in the repaired primary SPU becoming the
standby SPU. Asymmetrical configuration: You will need to
shut down the SPU running the primary host (prior standby) and
reboot the repaired primary SPU to now run the primary host.
Then, boot the standby SPU to run the standby host. Since only
the standby SPU has full access to all disks within the
configuration, this is your only designated standby SPU.
Q: What are the customer deliverables with SwitchOver/UX?
A: Customers will receive one copy of SwitchOver/UX software, two
unique link level addresses and the Managing SwitchOver/UX
Manual per SwitchOver/UX order (P/N 92668A). The two unique
link level addresses are used to reset the hardware link level
address associated with the LAN card. It is necessary for
SwitchOver/UX to operate with a software link level address
rather than a hardware link level address. Since the standby SPU
takes over the disks of the failed primary, the link level
address must be associated with the host rather than the SPU.
Two addresses are supplied to accommodate configurations with
two LANs.
Q: Can you share disks within the SwitchOver/UX configuration?
A: No, you cannot share data disks within your configuration. The
only disk which can be shared in the SwitchOver/UX
configuration is the dump disk. This disk can only be used as
the dump disk; there can be no data residing on this disk when
in shared mode.
Q: Do I need the same user license level on the primaries and the
standby?
A: You do not need to have identical user license levels on all
systems within the configuration. In the event of a
switchover, the standby system will assume the disks of the
failed primary and therefore the user license will be that of
the failed primary host. If users who had been operating on
the standby system want access to a system, they will need to
logon to one of the operational primary systems (based on user
license availability).
Q: After the switchover has occurred, is the standby host
available?
A: No, the disks and programs previously running on the standby
host are not available until the configuration is restored.
Q: Are there restrictions on the systems within a SwitchOver/UX
Configuration?
A: Yes, there are five unique categories of systems for the
SwitchOver/UX configuration:
(1) (2) (3) (4) (5)
827S 822S 825S 845S 850S
847S 832S 835S 855S
857S 842S 860S
867S 852S 865S
877S 870/100
870/200
870/300
870/400
All systems within the SwitchOver/UX configuration must be
within the same category. For example, a configuration may
combine an 832S and an 842S, but it cannot also include an 845S
or an 850S; it can only consist of 822S, 832S, 842S and 852S
systems.
Q: How do I switch printers/tapes from one to the other?
A: With the DTC Device Access/ARPA software, printers can be
shared among the systems within the configuration. This allows
continued access of printers and terminals even after a
switchover. HP-IB Tape drives can be manually switched using
the 93550A with an HP-IB switch module. Contact the Sales
Response Center regarding this product. SCSI tape drives
cannot be switched.
Q: How does SwitchOver/UX work with PLC's, SCADA systems, other
data gathering devices?
A: The integrity of data with SwitchOver/UX is up to the last
committed transaction. If the PLC, SCADA or other data
gathering device is in the process of transmitting data to the
S800 system when a failure occurs, that data which has not been
committed will be lost. The data gathering devices often have
limited memory and therefore the transmission of data to the
S800 may be frequent. There could be a user written check on
the data gathering device to determine if the data has been
received by the S800; this could have an impact on the
performance.
Q: How far apart can the P-bused disks be? How far apart can
SwitchOver/UX systems be?
A: 1000 meters (500 meters from one SPU to the disks in the
center; 500 meters from the disks to the other SPU).
Q: Will SwitchOver/UX run on the 700's and 300's?
A: No, there are no plans at this time.
Q: With SwitchOver/UX, is data in memory lost?
A: If the secondary takes over, the data in memory is lost. In
situations where power is lost to both the primary and the
secondary, the battery backup will kick in, and primary memory
will not be lost.
Q: Does SwitchOver/UX work in a Wide Area Network (WAN)?
A: No, not with the current product implementation. It is
not possible for an X.25 modem to change address analogous to
the ethernet address switch technique.
Q: Can there be more than one standby spu in a loosely-coupled
processor group?
A: No, currently only one spu can be configured to support a set
of primaries.
Q: How much overhead is added due to the state-of-health
(heartbeat) daemon?
A: The state-of-health daemon uses much less than 1% of the spu's
resources and LAN bandwidth.
Q: What happens to the processes executing on the standby when
automatic recovery starts?
A: The default method of handling these processes is to terminate
them immediately. This is done in order to speed up the
recovery time. It is also possible to gracefully shut down the
standby system by modifying the scripts provided with
SwitchOver/UX. With these modifications, all processes on the
standby can be cleanly stopped before the standby initiates a
reboot.
Q: Does SwitchOver/UX work with SQL databases?
A: SwitchOver/UX is designed to work with industry standard
databases like Oracle, Informix, Sybase, Ingres, Allbase, etc.
Some databases will perform better than others with
SwitchOver/UX because they can be configured to reduce recovery
times. Databases which support direct disk access and user
definable recovery times work best with SwitchOver/UX.
Informix, Oracle and Sybase access the disk directly. Also,
some databases have parameters which can be set to limit the
time needed to recover the database after an spu reboot.
Q: How does this product impact applications?
A: In general, applications do not need to be modified to take
advantage of SwitchOver/UX. To reduce data loss or speed up
recovery times, you may choose to modify your applications.
For example, to reduce data loss, the application can be
written to prevent the user from entering new transactions
until the previous transaction is committed. This technique
would eliminate data loss and speed up recovery, although it
would reduce throughput.
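One way to structure such an application is sketched below. This is an
illustration only: the class SerializedEntry and the commit callable
passed to it are hypothetical, and the commit callable is assumed to
return only once the transaction is durably committed.

```python
from typing import Any, Callable


class SerializedEntry:
    """Accept a new transaction only after the previous one is committed."""

    def __init__(self, commit: Callable[[Any], None]):
        self._commit = commit        # hypothetical durable-commit function
        self._in_flight = False

    def submit(self, txn: Any) -> None:
        # Refuse new input while an earlier transaction is uncommitted.
        if self._in_flight:
            raise RuntimeError("previous transaction not yet committed")
        self._in_flight = True
        try:
            self._commit(txn)        # returns only when the data is durable
        finally:
            self._in_flight = False
```

Because at most one transaction is ever uncommitted, at most one can be
in doubt after a failure; the cost is that entry waits on every commit,
which is the throughput reduction mentioned above.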
Q: In a SwitchOver/UX configuration with disk mirroring on the
primary, when the standby takes over, are the disks re-imaged?
A: Upon reboot, the disks will be checked to determine if they were
previously "in synch". If they were and there is no file
corruption, no re-synching is required and none will be done.
Q: When using SwitchOver/UX, can I have something, such as an
operator phone call initiated when the primary system has
failed?
A: Yes. When the primary system has failed, a script file can be
invoked on the standby system. This script can be customized
to the specific requirements of your environment.
Q: When will SwitchOver/UX and DataPair/800 support SCSI?
A: SwitchOver/UX will support SCSI disks with HP-UX Release 9.0.
This is planned for release mid 1992. HP DataPair/800 will not
be enhanced to support SCSI disks. Logical Volume Manager
(LVM) will provide SCSI and HP-FL disk mirroring functionality.